Feat: Outline import #1478
…kend:
- Add POST /api/v1.0/outline_import/upload (zip)
- Parse .md files, create doc tree from folders, rewrite image links to attachments, convert to Yjs via Y-provider
Frontend:
- Add /import/outline page with zip file picker + POST
- Add menu entry 'Import from Outline' in left panel header
- Add minimal i18n keys (en, fr)

…ous forbidden
- Authenticated happy path with local image and mocked conversion

…import.py and call from view
- Keep view thin; service handles zip, images, conversion, attachments
- Fix imports accordingly

…ject unsafe paths)
- Ignore __MACOSX and hidden entries
- Service unit tests (happy path + zip slip)
- Change API path to /imports/outline/upload and update front + tests

…kNote elements
- Convert H4/H5/H6 headings to compatible formats (H4→H3 with marker, H5→bold with arrow, H6→paragraph with bullet)
- Convert horizontal rules (---, ***, ___) to [DIVIDER_BLOCK] markers
- Preserve task lists formatting for proper checkbox rendering
- Add comprehensive unit tests for all conversion cases
This ensures Outline exports with all 6 heading levels and other markdown features are properly imported into BlockNote.js which only supports 3 heading levels.

…ted BlockNote elements" This reverts commit b7a7663.

Resolved conflict in translations.json by keeping Outline import translations

- Add CSRF token to Outline import upload request
- Fix content save by removing invalid update_fields parameter
- Handle nested documents properly to avoid duplicates when a document has child documents (e.g., Doc.md with Doc/ directory)
    def _upload_attachment(user, doc: models.Document, arcname: str, data: bytes) -> str:
        """Upload a binary asset into object storage and return its public media URL."""
        content_type, _ = mimetypes.guess_type(arcname)
What does the underlying library rely on? MIME type guessing is not always stable; even with libmagic, results can differ between versions and environments.
I would suggest thorough testing, preferably in different environments if possible.
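For context on the question above: the snippet under review calls the standard-library mimetypes module, which guesses from the file extension using a built-in table plus any system mime.types files it finds, so results can vary between hosts. A minimal sketch of one way to reduce that variance is shown below. The helper name, the pinned table, and the fallback value are assumptions for illustration and are not part of the PR.

```python
import mimetypes
import posixpath

# Types the importer is likely to meet most often, pinned so that the result
# does not depend on the host's mime.types files. This list is illustrative.
PINNED_TYPES = {
    ".png": "image/png",
    ".jpg": "image/jpeg",
    ".jpeg": "image/jpeg",
    ".gif": "image/gif",
    ".svg": "image/svg+xml",
    ".pdf": "application/pdf",
}
FALLBACK_TYPE = "application/octet-stream"


def guess_content_type(arcname: str) -> str:
    """Return a content type for an archive member name, never None."""
    extension = posixpath.splitext(arcname)[1].lower()
    if extension in PINNED_TYPES:
        return PINNED_TYPES[extension]
    # Fall back to the standard library, then to a generic binary type.
    guessed, _encoding = mimetypes.guess_type(arcname)
    return guessed or FALLBACK_TYPE
```

Pinning the common extensions keeps tests deterministic across environments, while unknown extensions still get a safe generic type instead of None.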
    parts = [p for p in name.split("/") if p]
    if any(part == ".." for part in parts):
        raise OutlineImportError("Unsafe path in archive")
Is this always unsafe or only when .. goes beyond the root you are iterating over?
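One way to explore the question above is to normalise the member name and reject it only when the result escapes the extraction root. The sketch below is an assumption about how such a check could look; the function name is hypothetical, and the stricter rule in the PR (reject any ".." segment) remains a simpler and defensible choice.

```python
import posixpath


def is_safe_member(name: str) -> bool:
    """Return True if a zip member name stays inside the extraction root."""
    # Zip member names use forward slashes; backslashes are normalised
    # defensively in case the archive was produced by a non-compliant tool.
    name = name.replace("\\", "/")
    if name.startswith("/"):
        return False  # absolute paths never belong to the import tree
    normalized = posixpath.normpath(name)
    # After normalisation, any remaining leading ".." means the path
    # climbs above the root.
    return not (normalized == ".." or normalized.startswith("../"))
```

With this rule, "docs/a/../b.md" is accepted because it resolves to "docs/b.md", while "docs/../../etc/passwd" is rejected.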
    uploaded = request.FILES.get("file")
    if not uploaded:
        raise drf.exceptions.ValidationError({"file": "File is required"})

    name = getattr(uploaded, "name", "")
    if not name.endswith(".zip"):
        raise drf.exceptions.ValidationError({"file": "Must be a .zip file"})
You should rely on a DRF serializer to validate the input here instead of doing it in the view. You may be able to reuse the FileUploadSerializer from the serializers module (src/backend/core/api/serializers.py).
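Whichever serializer ends up holding the validation, the checks themselves can stay independent of the framework so that a field validator only delegates to them. The sketch below is framework-free on purpose and is not the project's FileUploadSerializer; the function and exception names are hypothetical.

```python
import io
import zipfile


class InvalidUpload(ValueError):
    """Hypothetical error; a serializer would translate it into a ValidationError."""


def validate_zip_upload(name: str, content: bytes) -> None:
    """Raise InvalidUpload unless the upload looks like a zip archive."""
    if not name:
        raise InvalidUpload("File is required")
    # Case-insensitive extension check, so "export.ZIP" is accepted too.
    if not name.lower().endswith(".zip"):
        raise InvalidUpload("Must be a .zip file")
    # The extension alone proves nothing; also check the archive structure.
    if not zipfile.is_zipfile(io.BytesIO(content)):
        raise InvalidUpload("Not a valid zip archive")
```

Checking the content with zipfile.is_zipfile also covers the "fail fast" step that the view currently performs by opening the archive.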
Then, once the input is validated, you have to rely on the malware_detection feature to validate the zip content. We need to design a workflow for this, because that process is async: once malware detection has finished, process_outline_zip should start.
    # Fail fast if the upload is not a valid zip archive
    with zipfile.ZipFile(io.BytesIO(content)):
        pass
    created_ids = process_outline_zip(request.user, content)
I suggest saving the uploaded zip to the bucket storage (you can rely on the Django storage API). That way, you can create a Celery task to process the file asynchronously.
        return f"{settings.MEDIA_BASE_URL}{settings.MEDIA_URL}{key}"


    def process_outline_zip(user, zip_bytes: bytes) -> list[str]:
Following the previous comment about saving the file to the bucket, you can turn this function into a Celery task and execute it asynchronously.
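The workflow discussed in these comments (store the upload, wait for the malware scan, then process) can be sketched as a small state machine. Everything below is an assumption for illustration: the names are hypothetical, a plain dict stands in for the Django storage API, and a plain callable stands in for the Celery task, so this is not how the project wires it.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Callable


class ImportStatus(str, Enum):
    UPLOADED = "uploaded"
    SCANNING = "scanning"
    PROCESSING = "processing"
    DONE = "done"
    REJECTED = "rejected"


@dataclass
class ImportJob:
    storage_key: str
    status: ImportStatus = ImportStatus.UPLOADED
    created_ids: list[str] = field(default_factory=list)


def start_scan(job: ImportJob) -> None:
    """Mark the job as waiting for the asynchronous malware scan."""
    job.status = ImportStatus.SCANNING


def on_scan_finished(
    job: ImportJob,
    is_safe: bool,
    storage: dict[str, bytes],
    process: Callable[[bytes], list[str]],
) -> None:
    """Callback for the end of the scan: reject the file or process it."""
    if not is_safe:
        storage.pop(job.storage_key, None)  # drop the infected upload
        job.status = ImportStatus.REJECTED
        return
    job.status = ImportStatus.PROCESSING
    # In a real setup this call would be enqueued as a background task.
    job.created_ids = process(storage[job.storage_key])
    job.status = ImportStatus.DONE
```

Keeping the status on a job record would also give the frontend something to poll while the import runs.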
    models.DocumentAccess.objects.update_or_create(
        document=doc,
        user=user,
        defaults={"role": models.RoleChoices.OWNER},
    )
Suggested change:
    - models.DocumentAccess.objects.update_or_create(
    -     document=doc,
    -     user=user,
    -     defaults={"role": models.RoleChoices.OWNER},
    - )
You have to define an owner access for the user only on the root document. The children will then inherit this access.
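The inheritance described in this comment can be illustrated with a toy model in which a role is resolved by walking up the tree until an explicit access is found. This is only an illustration of the idea under that assumption; the class and function names are hypothetical and do not reflect the project's models.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Node:
    title: str
    parent: Optional["Node"] = None
    # Explicit accesses set on this node only: user id -> role name.
    roles: dict[str, str] = field(default_factory=dict)


def effective_role(node: Node, user_id: str) -> Optional[str]:
    """Return the nearest explicit role found on the node or its ancestors."""
    current: Optional[Node] = node
    while current is not None:
        role = current.roles.get(user_id)
        if role is not None:
            return role
        current = current.parent
    return None
```

Under this model, a single owner access on the root is enough for every imported child, which is why creating one access per document is redundant.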
        creator=user,
        title=part,
        link_reach=models.LinkReachChoices.RESTRICTED,
    )
Suggested change:
    - )
    + )
    + models.DocumentAccess.objects.create(
    +     document=doc,
    +     user=user,
    +     role=models.RoleChoices.OWNER,
    + )
    if parent_doc is None:
        doc = models.Document.add_root(
            depth=1,
            creator=user,
            title=title,
            link_reach=models.LinkReachChoices.RESTRICTED,
        )
    else:
        doc = parent_doc.add_child(creator=user, title=title)
Suggested change:
    - if parent_doc is None:
    -     doc = models.Document.add_root(
    -         depth=1,
    -         creator=user,
    -         title=title,
    -         link_reach=models.LinkReachChoices.RESTRICTED,
    -     )
    - else:
    -     doc = parent_doc.add_child(creator=user, title=title)
This is already handled in the _ensure_dir_documents function, so you will probably end up with duplicated docs.
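The duplication risk raised here arises when both a file such as Doc.md and a directory Doc/ try to create the same document. A minimal sketch of one way to avoid it is to plan every document path once, keyed by its path without the .md suffix, before creating anything. The function name is hypothetical and the sketch makes no claim about how _ensure_dir_documents works.

```python
def plan_document_paths(names: list[str]) -> list[str]:
    """Map archive member names to a de-duplicated, parent-first list of document paths."""
    planned: set[str] = set()
    for name in names:
        if name.endswith("/"):
            path = name.rstrip("/")  # explicit directory entry
        elif name.lower().endswith(".md"):
            path = name[:-3]  # "Doc.md" and "Doc/" share the key "Doc"
        else:
            continue  # attachments do not create documents
        parts = [part for part in path.split("/") if part]
        # Register the document and all of its ancestors exactly once.
        for depth in range(1, len(parts) + 1):
            planned.add("/".join(parts[:depth]))
    # Parents sort before children, so documents can be created in order.
    return sorted(planned, key=lambda p: (p.count("/"), p))
```

For an archive containing Doc.md, Doc/, Doc/Child.md and Doc/img.png, the plan holds only "Doc" and "Doc/Child", so each document is created a single time.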
    models.DocumentAccess.objects.update_or_create(
        document=doc,
        user=user,
        defaults={"role": models.RoleChoices.OWNER},
    )
Suggested change:
    - models.DocumentAccess.objects.update_or_create(
    -     document=doc,
    -     user=user,
    -     defaults={"role": models.RoleChoices.OWNER},
    - )
This is already handled in the _ensure_dir_documents function.
    extra_args = {
        "Metadata": {
            "owner": str(user.id),
            "status": enums.DocumentAttachmentStatus.READY,
Suggested change:
    - "status": enums.DocumentAttachmentStatus.READY,
    + "status": enums.DocumentAttachmentStatus.PROCESSING,
Purpose
Add Outline import functionality to allow users to migrate their documentation from Outline by uploading a .zip export file.
Proposal
External contributions
Thank you for your contribution! 🎉
Please ensure the following items are checked before submitting your pull request:
- git commit --signoff (DCO compliance)
- git commit -S
- <gitmoji>(type) title description
- ## [Unreleased] section (if noticeable change)